Goto

Collaborating Authors

 mrc dataset


ViQA-COVID: COVID-19 Machine Reading Comprehension Dataset for Vietnamese

Nguyen-Phung, Hai-Chung, Lê, Ngoc C., Nguyen, Van-Chien, Nguyen, Hang Thi, Nguyen, Thuy Phuong Thi

arXiv.org Artificial Intelligence

After two years of appearance, COVID-19 has negatively affected people and normal life around the world. As in May 2022, there are more than 522 million cases and six million deaths worldwide (including nearly ten million cases and over forty-three thousand deaths in Vietnam). Economy and society are both severely affected. The variant of COVID-19, Omicron, has broken disease prevention measures of countries and rapidly increased number of infections. Resources overloading in treatment and epidemics prevention is happening all over the world. It can be seen that, application of artificial intelligence (AI) to support people at this time is extremely necessary. There have been many studies applying AI to prevent COVID-19 which are extremely useful, and studies on machine reading comprehension (MRC) are also in it. Realizing that, we created the first MRC dataset about COVID-19 for Vietnamese: ViQA-COVID and can be used to build models and systems, contributing to disease prevention. Besides, ViQA-COVID is also the first multi-span extraction MRC dataset for Vietnamese, we hope that it can contribute to promoting MRC studies in Vietnamese and multilingual.


Recent Advances in Multi-Choice Machine Reading Comprehension: A Survey on Methods and Datasets

Foolad, Shima, Kiani, Kourosh, Rastgoo, Razieh

arXiv.org Artificial Intelligence

This paper provides a thorough examination of recent developments in the field of multi-choice Machine Reading Comprehension (MRC). Focused on benchmark datasets, methodologies, challenges, and future trajectories, our goal is to offer researchers a comprehensive overview of the current landscape in multi-choice MRC. The analysis delves into 30 existing cloze-style and multiple-choice MRC benchmark datasets, employing a refined classification method based on attributes such as corpus style, domain, complexity, context style, question style, and answer style. This classification system enhances our understanding of each dataset's diverse attributes and categorizes them based on their complexity. Furthermore, the paper categorizes recent methodologies into Fine-tuned and Prompt-tuned methods. Fine-tuned methods involve adapting pre-trained language models (PLMs) to a specific task through retraining on domain-specific datasets, while prompt-tuned methods use prompts to guide PLM response generation, presenting potential applications in zero-shot or few-shot learning scenarios. By contributing to ongoing discussions, inspiring future research directions, and fostering innovations, this paper aims to propel multi-choice MRC towards new frontiers of achievement.


Transfer Learning Enhanced Single-choice Decision for Multi-choice Question Answering

Cui, Chenhao, Jiang, Yufan, Wu, Shuangzhi, Li, Zhoujun

arXiv.org Artificial Intelligence

Multi-choice Machine Reading Comprehension (MMRC) aims to select the correct answer from a set of options based on a given passage and question. The existing methods employ the pre-trained language model as the encoder, share and transfer knowledge through fine-tuning.These methods mainly focus on the design of exquisite mechanisms to effectively capture the relationships among the triplet of passage, question and answers. It is non-trivial but ignored to transfer knowledge from other MRC tasks such as SQuAD due to task specific of MMRC.In this paper, we reconstruct multi-choice to single-choice by training a binary classification to distinguish whether a certain answer is correct. Then select the option with the highest confidence score as the final answer. Our proposed method gets rid of the multi-choice framework and can leverage resources of other tasks. We construct our model based on the ALBERT-xxlarge model and evaluate it on the RACE and DREAM datasets. Experimental results show that our model performs better than multi-choice methods. In addition, by transferring knowledge from other kinds of MRC tasks, our model achieves state-of-the-art results in both single and ensemble settings.


A Comprehensive Survey on Multi-hop Machine Reading Comprehension Datasets and Metrics

Mohammadi, Azade, Ramezani, Reza, Baraani, Ahmad

arXiv.org Artificial Intelligence

Abstract: Multi-hop Machine reading comprehension is a challenging task with aim of answering a question based on disjoint pieces of information across the different passages. The evaluation metrics and datasets are a vital part of multi-hop MRC because it is not possible to train and evaluate models without them, also, the proposed challenges by datasets often are an important motivation for improving the existing models. Due to increasing attention to this field, it is necessary and worth reviewing them in detail. This study aims to present a comprehensive survey on recent advances in multi-hop MRC evaluation metrics and datasets. In this regard, first, the multi-hop MRC problem definition will be presented, then the evaluation metrics based on their multi-hop aspect will be investigated. Also, 15 multi-hop datasets have been reviewed in detail from 2017 to 2022, and a comprehensive analysis has been prepared at the end. Finally, open issues in this field have been discussed. Keywords: Multi-hop Machine Reading Comprehension, Multi-hop Machine Reading Comprehension Dataset, Natural Language Processing, 1-INTRODUCTION Machine reading comprehension (MRC) is one of the most important and long-standing topics in Natural Language Processing (NLP). MRC provides a way to evaluate an NLP system's capability for natural language understanding. An MRC task, in brief, refers to the ability of a computer to read and understand natural language context and then find the answer to questions about that context. The emergence of large-scale single-document MRC datasets, such as SQuAD (Rajpurkar et al., 2016), CNN/Daily mail (Hermann et al., 2015), has led to increased attention to this topic and different models have been proposed to address the MRC problem, such as (D. However, for many of these datasets, it has been found that models don't need to comprehend and reason to answer a question. For example, Khashabi et al (Khashabi et al., 2016) proved that adversarial perturbation in candidate answers has a negative effect on the performance of the QA systems. Similarly, (Jia & Liang, 2017) showed that adding an adversarial sentence to the SQuAD (Rajpurkar et al., 2016) context will drop the result of many existing models.


IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension

Putri, Rifki Afina, Oh, Alice

arXiv.org Artificial Intelligence

Machine Reading Comprehension (MRC) has become one of the essential tasks in Natural Language Understanding (NLU) as it is often included in several NLU benchmarks (Liang et al., 2020; Wilie et al., 2020). However, most MRC datasets only have answerable question type, overlooking the importance of unanswerable questions. MRC models trained only on answerable questions will select the span that is most likely to be the answer, even when the answer does not actually exist in the given passage (Rajpurkar et al., 2018). This problem especially remains in medium- to low-resource languages like Indonesian. Existing Indonesian MRC datasets (Purwarianti et al., 2007; Clark et al., 2020) are still inadequate because of the small size and limited question types, i.e., they only cover answerable questions. To fill this gap, we build a new Indonesian MRC dataset called I(n)don'tKnow- MRC (IDK-MRC) by combining the automatic and manual unanswerable question generation to minimize the cost of manual dataset construction while maintaining the dataset quality. Combined with the existing answerable questions, IDK-MRC consists of more than 10K questions in total. Our analysis shows that our dataset significantly improves the performance of Indonesian MRC models, showing a large improvement for unanswerable questions.


PQuAD: A Persian Question Answering Dataset

Darvishi, Kasra, Shahbodagh, Newsha, Abbasiantaeb, Zahra, Momtazi, Saeedeh

arXiv.org Artificial Intelligence

It includes 80,000 questions along with their answers, with 25% of the questions being adversarially unanswerable. We examine various properties of the dataset to show the diversity and the level of its difficulty as a MRC benchmark. By releasing this dataset, we aim to ease research on Persian reading comprehension and development of persian question answering systems. Our experiments on different state-of-the-art pre-trained contextualized language models shows 74.8% Exact Match (EM) and 87.6% F1-score that can be used as the baseline results for further research on Persian QA.


More Than Reading Comprehension: A Survey on Datasets and Metrics of Textual Question Answering

Bai, Yang, Wang, Daisy Zhe

arXiv.org Artificial Intelligence

Textual Question Answering (QA) aims to provide precise answers to user's questions in natural language using unstructured data. One of the most popular approaches to this goal is machine reading comprehension(MRC). In recent years, many novel datasets and evaluation metrics based on classical MRC tasks have been proposed for broader textual QA tasks. In this paper, we survey 47 recent textual QA benchmark datasets and propose a new taxonomy from an application point of view. In addition, We summarize 8 evaluation metrics of textual QA tasks. Finally, we discuss current trends in constructing textual QA benchmarks and suggest directions for future work.


Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources

Zhang, Taolin, Wang, Chengyu, Qiu, Minghui, Yang, Bite, He, Xiaofeng, Huang, Jun

arXiv.org Artificial Intelligence

Machine Reading Comprehension (MRC) aims to extract answers to questions given a passage. It has been widely studied recently, especially in open domains. However, few efforts have been made on closed-domain MRC, mainly due to the lack of large-scale training data. In this paper, we introduce a multi-target MRC task for the medical domain, whose goal is to predict answers to medical questions and the corresponding support sentences from medical information sources simultaneously, in order to ensure the high reliability of medical knowledge serving. A high-quality dataset is manually constructed for the purpose, named Multi-task Chinese Medical MRC dataset (CMedMRC), with detailed analysis conducted. We further propose the Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models by the dynamic fusion mechanism of heterogeneous features and the multi-task learning strategy. Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.


A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics, and Benchmark Datasets

Zeng, Chengchang, Li, Shaobo, Li, Qin, Hu, Jie, Hu, Jianjun

arXiv.org Artificial Intelligence

Machine Reading Comprehension (MRC) is a challenging NLP research field with wide real world applications. The great progress of this field in recent years is mainly due to the emergence of large-scale datasets and deep learning. At present, a lot of MRC models have already surpassed the human performance on many datasets despite the obvious giant gap between existing MRC models and genuine human-level reading comprehension. This shows the need of improving existing datasets, evaluation metrics and models to move the MRC models toward 'real' understanding. To address this lack of comprehensive survey of existing MRC tasks, evaluation metrics and datasets, herein, (1) we analyzed 57 MRC tasks and datasets; proposed a more precise classification method of MRC tasks with 4 different attributes (2) we summarized 9 evaluation metrics of MRC tasks and (3) 7 attributes and 10 characteristics of MRC datasets; (4) We also discussed some open issues in MRC research and highlight some future research directions. In addition, to help the community, we have collected, organized, and published our data on a companion website(https://mrc-datasets.github.io/) where MRC researchers could directly access each MRC dataset, papers, baseline projects and browse the leaderboard.


BIOMRC: A Dataset for Biomedical Machine Reading Comprehension

Stavropoulos, Petros, Pappas, Dimitris, Androutsopoulos, Ion, McDonald, Ryan

arXiv.org Machine Learning

We introduce BIOMRC, a large-scale cloze-style biomedical MRC dataset. Care was taken to reduce noise, compared to the previous BIOREAD dataset of Pappas et al. (2018). Experiments show that simple heuristics do not perform well on the new dataset, and that two neural MRC models that had been tested on BIOREAD perform much better on BIOMRC, indicating that the new dataset is indeed less noisy or at least that its task is more feasible. Non-expert human performance is also higher on the new dataset compared to BIOREAD, and biomedical experts perform even better. We also introduce a new BERT-based MRC model, the best version of which substantially outperforms all other methods tested, reaching or surpassing the accuracy of biomedical experts in some experiments. We make the new dataset available in three different sizes, also releasing our code, and providing a leaderboard.